Skip to content

[SPARK-47723][CORE][TESTS] Introduce a tool that can sort alphabetically enumeration field in LogEntry automatically#45867

Closed
panbingkun wants to merge 8 commits into
apache:masterfrom
panbingkun:SPARK-47723
Closed

[SPARK-47723][CORE][TESTS] Introduce a tool that can sort alphabetically enumeration field in LogEntry automatically#45867
panbingkun wants to merge 8 commits into
apache:masterfrom
panbingkun:SPARK-47723

Conversation

@panbingkun

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

The pr aims to introduce a tool that can sort alphabetically enumeration field in LogEntry automatically.

Why are the changes needed?

Enable developers to more conveniently write the enumeration values in LogEntry in alphabetical order according to the requirements of structured log development documents.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Manually test.
    SPARK_GENERATE_GOLDEN_FILES=1 
    build/sbt "common-utils/testOnly *LogKeySuite -- -t \"LogKey enumeration fields are correctly sorted\""
    
  • Pass GA.

Was this patch authored or co-authored using generative AI tooling?

No.

…lly enumeration field in `LogEntry` automatically
@panbingkun panbingkun changed the title [SPARK-47723][CORE][TESTS] Introduce a tool that can sort alphabetically enumeration field in LogEntry automatically [WIP][SPARK-47723][CORE][TESTS] Introduce a tool that can sort alphabetically enumeration field in LogEntry automatically Apr 4, 2024
@panbingkun panbingkun changed the title [WIP][SPARK-47723][CORE][TESTS] Introduce a tool that can sort alphabetically enumeration field in LogEntry automatically [SPARK-47723][CORE][TESTS] Introduce a tool that can sort alphabetically enumeration field in LogEntry automatically Apr 4, 2024
@panbingkun panbingkun marked this pull request as ready for review April 4, 2024 03:01
@panbingkun

Copy link
Copy Markdown
Contributor Author

test("LogKey enumeration fields are correctly sorted") {
val originalKeys = LogKey.values.toSeq
val sortedKeys = originalKeys.sortBy(_.toString)
if (regenerateGoldenFiles) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move the regeneration to a new method instead of keeping in the test case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean: extract the regenerated method into an independent method and call it in that place?

test("Error classes are correctly formatted") {
val errorClassFileContents =
IOUtils.toString(errorJsonFilePath.toUri.toURL.openStream(), StandardCharsets.UTF_8)
val mapper = JsonMapper.builder()
.addModule(DefaultScalaModule)
.enable(SerializationFeature.INDENT_OUTPUT)
.build()
val prettyPrinter = new DefaultPrettyPrinter()
.withArrayIndenter(DefaultIndenter.SYSTEM_LINEFEED_INSTANCE)
val rewrittenString = mapper.configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true)
.setSerializationInclusion(Include.NON_ABSENT)
.writer(prettyPrinter)
.writeValueAsString(errorReader.errorInfoMap)
if (regenerateGoldenFiles) {
if (rewrittenString.trim != errorClassFileContents.trim) {
val errorClassesFile = errorJsonFilePath.toFile
logInfo(s"Regenerating error class file $errorClassesFile")
Files.delete(errorClassesFile.toPath)
FileUtils.writeStringToFile(
errorClassesFile,
rewrittenString + lineSeparator,
StandardCharsets.UTF_8)
}
} else {
assert(rewrittenString.trim == errorClassFileContents.trim)
}
}

The basic idea is that when without setting environment variables SPARK_GENERATE_GOLDEN_FILES (default is false), the UT will only check whether the enumeration fields in LogKey have been sorted alphabetically.

The steps used in my process of migrating error classes are roughly as follows:

  • Before I submit the changes, if I remember the formatting step, I will use the following command to regenerate the error classes file error-classes.json and submit them.
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "core/testOnly *SparkThrowableSuite -- -t \"Error classes are correctly formatted\""

  • If I forget this step, if there are incorrect item in error-classes.json that are not sorted alphabetically, GA will help me discover (UT failed), then I will execute above command on local and resubmit the changes.

I guess our scene is very similar to this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, create a new method in this file .

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let me do it right now.

/* Used to regenerate the LogKey class file. Run:
{{{
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt \
"common-utils/testOnly *LogKeySuite -- -t \"LogKey enumeration fields are correctly sorted\""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we remove the -t ... from the instruction to make it simpler?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@panbingkun

Copy link
Copy Markdown
Contributor Author

@gengliangwang all done.
Thanks.

Comment thread common/utils/src/test/scala/org/apache/spark/util/LogKeySuite.scala Outdated
Co-authored-by: Gengliang Wang <gengliang@apache.org>
val keys = LogKey.values.toSeq
assert(keys === keys.sortBy(_.toString),
"LogKey enumeration fields must be sorted alphabetically")
// scalastyle:off line.size.limit

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer needed.

assert(keys === keys.sortBy(_.toString),
"LogKey enumeration fields must be sorted alphabetically")
// scalastyle:off line.size.limit
/* Used to regenerate the LogKey class file. Run:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move this before line 32 class LogKeySuite

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@gengliangwang

Copy link
Copy Markdown
Member

Thanks, merging to master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants